Vocal tract representation in the recognition of cerebral palsied speech.
Abstract
PURPOSE: In this study, the authors explored articulatory information as a means of improving the machine recognition of dysarthric speech.
METHOD: Data were derived chiefly from the TORGO database of dysarthric articulation (Rudzicz, Namasivayam, & Wolff, 2011), in which the motions of various points in the vocal tract are measured during speech. In the first experiment, the authors established a baseline model, which showed relatively low performance for traditional automatic speech recognition (ASR) using only acoustic data from dysarthric individuals. In the second experiment, the authors used various measures of entropy (statistical disorder) to determine whether characteristics of dysarthric articulation can reduce uncertainty in features of dysarthric acoustics. These findings led to the third experiment, in which recorded dysarthric articulation was directly encoded into the speech recognition process.
RESULTS: The authors found that 18.3% of the statistical disorder in the acoustics of speakers with dysarthria can be removed if articulatory parameters are known. Using articulatory models yields a relative reduction in phoneme recognition errors of up to 6% for speakers with dysarthria in speaker-dependent systems.
CONCLUSIONS: Articulatory knowledge is useful in reducing error rates in ASR for speakers with dysarthria and in reducing the statistical uncertainty of their acoustic signals. These findings may help to guide future clinical decisions related to the use of ASR.
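The two headline figures in the abstract are both relative reductions: a share of acoustic entropy removed once articulation is known, and a relative drop in phoneme error rate. The sketch below illustrates that arithmetic only. It assumes discrete Shannon entropy over a toy joint distribution, which may differ from the entropy measures the authors actually used, and every number in it (the joint table and the error rates) is hypothetical rather than taken from the paper or from TORGO.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def conditional_entropy(joint):
    """H(A | R) in bits, where joint[r][a] = P(R = r, A = a)."""
    total = 0.0
    for row in joint:
        p_r = sum(row)  # marginal probability of articulatory class r
        if p_r > 0:
            total += p_r * entropy([p / p_r for p in row])
    return total


def relative_reduction(before, after):
    """Fraction of `before` that is removed when it drops to `after`."""
    return (before - after) / before


# Hypothetical joint distribution (NOT from the paper):
# rows are articulatory classes R, columns are acoustic classes A.
joint = [
    [0.30, 0.10],
    [0.10, 0.50],
]

# Marginal distribution of the acoustic classes: sum each column.
acoustic_marginal = [sum(col) for col in zip(*joint)]

h_acoustic = entropy(acoustic_marginal)        # H(A)
h_acoustic_given_artic = conditional_entropy(joint)  # H(A | R)

# Share of acoustic uncertainty removed by knowing articulation.
entropy_removed = relative_reduction(h_acoustic, h_acoustic_given_artic)

# Hypothetical phoneme error rates (NOT from the paper): a drop from
# 50% to 47% is 3 points absolute but 6% relative.
error_reduction = relative_reduction(0.50, 0.47)

print(f"entropy removed: {entropy_removed:.1%}")
print(f"relative error reduction: {error_reduction:.1%}")
```

Read this way, a relative figure such as "up to 6%" describes the change as a fraction of the baseline error rate, not a drop of six percentage points.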
Similar resources
Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender-detection modeling
The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...
The Effects of Culture and Gender on the Recognition of Emotional Speech: Evidence from Persian Speakers Living in a Collectivist Society
This paper reports on a behavioral study that explores the role of culture and gender in the recognition of emotional speech in an under-investigated cultural context (a collectivist society, i.e., Iran). Participants were asked to recognize the emotional prosody of a set of validated emotional vocal portrayals (including the five basic emotions). Findings of the experiment were then comp...
Effects of Voice Therapy on Vocal Tract Discomfort in Muscle Tension Dysphonia
Introduction: Patients with muscle tension dysphonia (MTD) suffer from several physical discomforts in their vocal tract. However, few studies have examined the effects of voice therapy (VT) on the vocal tract discomfort (VTD) in patients with voice disorders. Therefore, the aim of the present study was to investigate the effects of VT on the VTD in patients with MTD. Materi...
Towards a noisy-channel model of dysarthria in speech recognition
Modern automatic speech recognition is ineffective at understanding relatively unintelligible speech caused by neuro-motor disabilities collectively called dysarthria. Since dysarthria is primarily an articulatory phenomenon, we are collecting a database of vocal tract measurements during speech of individuals with cerebral palsy. In this paper, we demonstrate that articulatory knowledge can re...
Avicenna's Anatomical Legacy as Seen Through the Relevant Topics in Modern Anatomy
Background: Makhaarej Al-Horouf, Avicenna's study of speech sounds, is a valuable work written about ten centuries ago. It contains six chapters on sound, the anatomy of the vocal tract, and phonetics. Remarkably, Avicenna's explanations are congruent with the findings of modern scholarship on the relevant topics. The study was intended t...
Journal title:
- Journal of speech, language, and hearing research : JSLHR
Volume 55, Issue 4
Pages: -
Publication date: 2012